{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Unsupervised Machine Learning with Scikit-learn\n",
    "\n",
    "\n",
    "#### Types of ML\n",
    "\n",
    "![](ml_types2.png)\n",
    "\n",
    "+ Source: tdsc\n",
    "\n",
    "\n",
    "Unsupervised ML:  use of unlabeled data for training \n",
    "+ The systems tries to learn without a teacher\n",
    "+ You are not given a target value/label that you are to predict the outcome as\n",
    "+ We only have the independent variables(x) and no target/dependent variable(y) in these problems.\n",
    "\n",
    "\n",
    "#### Libraries\n",
    "+ pandas\n",
    "+ scikit-learn\n",
    "+ pycaret\n",
    "+ hdbscan\n",
    "\n",
    "#### Types of UML Algorithms\n",
    "+ Clustering\n",
    "+ Anomaly Detection:outlier detection\n",
    "+ Novelty Detection:\n",
    "+ Visualization\n",
    "+ Dimensionality Reduction:simplify data without losing too much info by merging correlated features into one\n",
    "+ Association Rule Learning:discovering interesting relations btwn attributes\n",
    "\n",
    "    \n",
    "#### Clustering Unsupervised ML Task\n",
    "+ Clustering:\n",
    "    - The process of grouping dataset in to groups in such a way that similar datapoints are in the same group\n",
    "    - Usefulness\n",
    "        - EDA\n",
    "        - Pattern Recognition\n",
    "        - Image Analysis\n",
    "        + For working with unlabelled data\n",
    "        + Search engines image search\n",
    "        + Customer segmentation\n",
    "        + Market segmentation\n",
    "        + Outlier Detection\n",
    "        + clustering similar documents together, \n",
    "        + recommending similar songs or movies,\n",
    "\n",
    "\n",
    "#### Types of Clustering\n",
    "+ Flat vs Hierarchical\n",
    "+ Centroid Based vs Density Based\n",
    "\n",
    "\n",
    "![](mlclassifiers.png)\n",
    "\n",
    "\n",
    "#### Basic Principle Behind Clustering\n",
    "All are based on different distance measures. \n",
    " \n",
    " + K-Means (distance between points), \n",
    " + Affinity propagation (graph distance), \n",
    " + Mean-shift (distance between points), \n",
    " + DBSCAN (distance between nearest points), \n",
    " + Gaussian mixtures (Mahalanobis distance to centers), \n",
    " + Spectral clustering (graph distance), etc.\n",
    "\n",
    "#### Terms\n",
    "+ Centroid: a data point at the center of a group/cluster\n",
    "\n",
    "\n",
    "#### Data Source\n",
    "+ https://archive.ics.uci.edu/ml/datasets\n",
    "    "
   ]
  },
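  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the distance idea and the centroid term above concrete, here is a minimal sketch. It is not part of the original walkthrough: the three points are made up for illustration, the centroid is taken as their mean, and NumPy measures the Euclidean distance from each point to it.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Hypothetical 2-D points, chosen only for illustration\n",
    "points = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 0.0]])\n",
    "\n",
    "# Centroid = per-feature mean of the points in the group\n",
    "centroid = points.mean(axis=0)\n",
    "\n",
    "# Euclidean distance from every point to the centroid\n",
    "dists = np.linalg.norm(points - centroid, axis=1)\n",
    "\n",
    "print(centroid)\n",
    "print(dists)\n",
    "```\n",
    "\n",
    "K-Means assigns each point to the centroid with the smallest such distance."
   ]
  },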
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Clustering using KMeans Algorithm\n",
    "+ Kmeans(Lloyd Forgy Algorithm)\n",
    "+ MiniBatchKmeans\n",
    "\n",
    "#### Benefits\n",
    "+ Fast and Scalable\n",
    "\n",
    "\n",
    "#### Demerits\n",
    "+ Need to know th number of clusters\n",
    "+ problem of having to pre-define the number of clusters.\n",
    "    - Elbow Method/Silhoutte Method\n",
    "    - Hierarchical Clustering:\n",
    "+ Varying sizes and different densities may affect performance\n",
    "    - scaling\n",
    "\n",
    "\n",
    "#### Terms\n",
    "+ k = number of clusters\n",
    "+ .label_: index of cluster \n",
    "+ .cluster_centers_: centroid\n",
    "+ .inertia_: performance metric of mean sq distance btwn each instance and its closest centroid\n",
    "+ .score:\n",
    "\n"
   ]
  },
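  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The terms above can be seen on a fitted model. The following is a minimal sketch under stated assumptions: the six points are toy data invented for this example (not the customer dataset used later), and `n_init=10` and `random_state=0` are fixed only to make the run repeatable.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from sklearn.cluster import KMeans\n",
    "\n",
    "# Toy data (made up): two well-separated groups in 2-D\n",
    "X = np.array([[1.0, 1.0], [1.2, 0.8], [0.8, 1.1],\n",
    "              [8.0, 8.0], [8.2, 7.9], [7.9, 8.3]])\n",
    "\n",
    "km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)\n",
    "\n",
    "print(km.labels_)           # cluster index of each row\n",
    "print(km.cluster_centers_)  # one centroid per cluster\n",
    "print(km.inertia_)          # sum of squared distances to the closest centroid\n",
    "print(km.score(X))          # negative of the same objective, evaluated on X\n",
    "```\n",
    "\n",
    "On the training data, `score(X)` should equal `-inertia_` up to floating-point error."
   ]
  },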
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load EDA Pkgs\n",
    "import pandas as pd\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/usr/local/lib/python3.7/dist-packages/statsmodels/tools/_testing.py:19: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.\n",
      "  import pandas.util.testing as tm\n"
     ]
    }
   ],
   "source": [
    "# Load Visualization\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Task\n",
    "+ Find the different clusters\n",
    "+ Customer segmentation\n",
    "    - Different types of customer/buyers in a market"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Working with Kmeans"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### How to Find the Optimal K for the cluster\n",
    "+ Elbow Method\n",
    "+ Silhouetter Method\n",
    "\n",
    "\n",
    "#### How to choose the best k - number for your cluster\n",
    "+ Elbow method\n",
    "+ Silhoutte score\n",
    "    - +1 : close to 1 means that datapoint is inside it own cluster\n",
    "    - 0 : close to 0 means it is close to a cluster boundary\n",
    "    - -1 : close to 1 means it is in a wrong cluster\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Elbow Method\n",
    "# Use the sum of squared distance and find where there is a bent\n",
    "\n",
    "sum_of_sqr_distance = []\n",
    "k_range = range(1,15)\n",
    "for k in k_range:\n",
    "    km_model2 = KMeans(n_clusters=k)\n",
    "    km_model2.fit(df)\n",
    "    sum_of_sqr_distance.append(km_model2.inertia_)"
   ]
  },
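  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The loop above only collects the inertia values; the bend is easier to spot on a plot. Below is a self-contained sketch of that step. It assumes synthetic data from `make_blobs` (three blobs, so the bend is expected near k=3) rather than the notebook's `df`, and the variable names are illustrative.\n",
    "\n",
    "```python\n",
    "import matplotlib.pyplot as plt\n",
    "from sklearn.cluster import KMeans\n",
    "from sklearn.datasets import make_blobs\n",
    "\n",
    "# Synthetic stand-in for the real feature table (assumption for this sketch)\n",
    "X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)\n",
    "\n",
    "k_range = range(1, 10)\n",
    "inertias = []\n",
    "for k in k_range:\n",
    "    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)\n",
    "    inertias.append(model.inertia_)\n",
    "\n",
    "# Inertia against k: look for the point where the curve flattens out\n",
    "fig, ax = plt.subplots()\n",
    "ax.plot(list(k_range), inertias, marker='o')\n",
    "ax.set_xlabel('k (number of clusters)')\n",
    "ax.set_ylabel('Inertia (sum of squared distances)')\n",
    "ax.set_title('Elbow Method')\n",
    "```\n",
    "\n",
    "Inertia generally keeps falling as k grows, so the useful signal is where the drop slows down, not where the curve is lowest."
   ]
  },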
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Using Silhouette Method for finding Optimal K\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "For n_clusters = 2, silhouette score is 0.6547312337772452)\n",
      "For n_clusters = 3, silhouette score is 0.7393168523275077)\n",
      "For n_clusters = 4, silhouette score is 0.7207226008574878)\n",
      "For n_clusters = 5, silhouette score is 0.6382039175207864)\n",
      "For n_clusters = 6, silhouette score is 0.5801124274809953)\n"
     ]
    }
   ],
   "source": [
    "range_n_clusters = [2, 3, 4, 5, 6]\n",
    "for n_clusters in range_n_clusters:\n",
    "    km_models3 = KMeans(n_clusters=n_clusters)\n",
    "    preds = km_models3.fit_predict(df)\n",
    "    centers = km_models3.cluster_centers_\n",
    "\n",
    "    score = silhouette_score(df, preds)\n",
    "    print(\"For n_clusters = {}, silhouette score is {})\".format(n_clusters, score))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Plotting the Clusters"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Scatter Plot with Centroids"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "jupyter": {
     "source_hidden": true
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.legend.Legend at 0x7fc935896950>"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZcAAAEGCAYAAACpXNjrAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO2df3xcZZnov0+maSENliatXKBkglphW34UW1lYRFtBhKKiuLp0g0aqBFtFWPfqRXL3FvDGj7ruVWChWpdC2cb6C0TAsoi1XRVRCVppy68WSEJqLW1KW9pSmmae+8d5T3IyOWfmTGYmM0mebz/nMzPPec85z0wn55nnfX68oqoYhmEYRiGpKLUChmEYxujDjIthGIZRcMy4GIZhGAXHjIthGIZRcMy4GIZhGAVnXKkVGG6mTJmi9fX1pVbDMAxjRPHEE0/sVNWpccePOeNSX19PW1tbqdUwDMMYUYhIRy7jbVrMMAzDKDhmXAzDMIyCY8bFMAzDKDhjLuYSRk9PD11dXRw8eLDUqowIjjjiCKZNm0ZlZWWpVTEMo0wx4wJ0dXVx1FFHUV9fj4iUWp2yRlXp7u6mq6uLE088sdTqGIZRphRtWkxEThCRtSLylIhsEpFrnLxGRB4Rkc3ucbKTi4jcIiJbRORJEXlb4FyNbvxmEWkMyGeLyAZ3zC0yRMtw8OBBamtrzbDEQESora01L88wAFpbob4eKiq8x9bWUmtUNhQz5nIY+GdVnQGcBXxGRGYA1wFrVHU6sMa9BrgImO62JmApeMYIWAL8LXAmsMQ3SG7MlYHjLhyqsmZY4mOflWHgGZKmJujoAFXvsanJDIyjaMZFVbep6h/d81eBp4HjgUuAFW7YCuCD7vklwN3q8TvgaBE5Fngv8Iiq7lLVV4BHgAvdvjeo6u/UWzfg7sC5DMMwiktzMxw4MFB24IAnN4YnW0xE6oEzgN8Dx6jqNrfrr8Ax7vnxwEuBw7qcLJO8K0Qedv0mEWkTkbYdO3bk9V7Kld27d3P77bfnfFxbWxuf+9znQvfV19ezc+fOfFUzjNFJZ2du8jFG0Y2LiFQD9wDXqure4D7ncRR9tTJVXaaqc1R1ztSpsbsXjCgyGZfDhw9HHjdnzhxuueWWYqllGKOXurrc5GOMohoXEanEMyytqnqvE293U1q4x5edfCtwQuDwaU6WST4tRD4sHNxzkNtm3sbBPYUJbN99992cdtppnH766XzsYx9jx44dfPjDH+btb387b3/723n00UcBuOGGG1i4cCFz587lTW96U59huO6663j++eeZNWsWX/jCF1i3bh3nnnsuH/jAB5gxYwYHDx7kiiuu4NRTT+WMM85g7dq1AKxbt473ve99AHR3d3PBBRcwc+ZMPvWpT2GrlBpGBlpaoKpqoKyqypMbXmppMTZA8OIg30qT/ytwnXt+HfB19/xi4CF33FnAH5y8BngRmOy2F4Eat+8Pbqy4Y+dn02v27NmazlNPPTVIlo0nW5/UG7hBn/zekzkfm87GjRt1+vTpumPHDlVV7e7u1gULFuivf/1rVVXt6OjQk08+WVVVlyxZomeffbYePHhQd+zYoTU1NXro0CF98cUXdebMmX3nXLt2rVZVVekLL7ygqqrf+MY39IorrlBV1aefflpPOOEEfe2113Tt2rV68cUXq6rq1VdfrTfeeKOqqj744IMK9OmUzlA+M8MYdaxcqZpMqop4jytXllqjogG0aQ42oJh1LucAHwM2iMh6J7se+CrwQxH5JNABfNTtWw3MB7YAB4ArAFR1l4h8GXjcjbtJVXe554uBu4Aj8YzLQ0V8PwDc84/38Oz9z9L7ei8A9338Ph648gFO+sBJfPh7Hx7SOX/5y1/ykY98hClTpgBQU1PDL37xC5566qm+MXv37mXfvn0AXHzxxUyYMIEJEybwxje+ke3bt4ee98wzz+yrRfnNb37D1Vdf
DcDJJ59MMpnkueeeGzD+V7/6Fffee2/fNSZPnoxhGBloaPA2YxBFMy6q+hs8jyKM80LGK/CZiHMtB5aHyNuAU/JQM2fm3TSPv67/K7vbd5M6nKKisoKjk0cz78vzCnqdVCrF7373O4444ohB+yZMmND3PJFIRMZUJk6cWFCdDMMoIq2tXqZZZ6cXt2lpGdGGy3qL5UjNW2qYd9M8Uj0pKidWkupJMffGudS8uWbI53z3u9/Nj370I7q7uwHYtWsXF1xwAbfeemvfmPXr10cdDsBRRx3Fq6++Grn/3HPPpdXl3z/33HN0dnZy0kknDRjzzne+k+9973sAPPTQQ7zyyitDej+GYeTIKKyZMeMyBDb9cBOVEyuZe+NcKidWsulHm/I638yZM2lubuZd73oXp59+Op///Oe55ZZbaGtr47TTTmPGjBl8+9vfzniO2tpazjnnHE455RS+8IUvDNq/ePFiUqkUp556Kv/wD//AXXfdNcADAliyZAm/+tWvmDlzJvfeey91lvViGMPDKKyZER1jGUFz5szR9MXCnn76af7mb/4m9jm2Pr6VSXWTqD6mmn3b97H3pb0cN+e4Qqta1uT6mRnGmCOXaa5MXS/K5B4tIk+o6py4461x5RA4/u39tZrVx1RTfUx1CbUxDKPs8Ke5fG/En+aCcAOTSEBvb7h8hGLTYoZhGIUm12muMMOSST4CMOPiGGvTg/lgn5VhZCHX1jCZPJQR2m3ZjAve4lfd3d1204yBuvVcwlKkDWPUE7fFfq6tYTJ5KB0dsHAhTJkyolr7W8wFmDZtGl1dXYzWppaFxl+J0jDGFLnEUVpaBo6Fwa1hggH/qJiLz6FD4EoVssZvygTLFjMMw4hDfb13Y08nmYT29sHyTNli6YZqKERdt0jkmi1m02KGYRg+6dNeixf3vw4zLODJw6arGhq8m38q5T0GvYywgH+ulHlrfzMuhmEYEF4lv3Rp/+tM+OMvv9yrWRGB88+PHl8IwxCM35ThcstmXIyC07qhlfpv1VNxYwX136qndUPpv+iGkZVCeBNB1qyJNjC5dL+orfWMRpDKyv74TZm2jjHjYhSU1g2tND3QRMeeDhSlY08HTQ80mYExypPgL/6oaa98WLMmXB62FkwYixbBzTfDuLTcq2BFf5m2jrGAvlFQ6r9VT8eewX+kyUlJ2q9tH36FDCOK1la44gro6SnaJRRI3CDUTaqj5bwWGk4NxF2CAf+KivBssWTSe8yUSFBRET5tJ+LFewqEBfSNktK5J3wuOUpuGCXjmmuKalh8fA/+F1++gn3HBWpVoD/gH5WG3NERHZ/xEwnSp8x8Stx41oyLUVDqJoV/oaPkhlEy/LqRfMlQXa9A7w3w8tdg6T09VG/r7o+LXHFFf2FkpnNnMhKq4YapDJZbLppxEZHlIvKyiGwMyH4gIuvd1u6vUCki9SLyWmDftwPHzBaRDSKyRURuEfEmG0WkRkQeEZHN7tGWTcyDQgXhW85roapy4FxyVWUVLefZuuLGKKSqClasiOxqXOG2qa/BEekzVD09noHLFJro7Y0fn0kkPD2SSVi2rOQFlsX0XO4CLgwKVPUfVHWWqs4C7gHuDex+3t+nqp8OyJcCVwLT3eaf8zpgjapOB9a418YQKGQQvuHUBpa9fxnJSUkEITkpybL3Lxs412wY5UBt7dCP9W/ijY1e3KSYsevmZjj77OwdklOp8JqaElHUgL6I1AMPquopaXIBOoF3q+rmDOOOBdaq6snu9QJgrqpeJSLPuufb3Lh1qjpwacUQLKA/GAvCG2OS1lavZ9ehQ7kdV1sLO3cWpsq+kBS5Yn+kBPTPBbar6uaA7EQR+ZOI/LeInOtkxwNdgTFdTgZwjKpuc8//ChwTdTERaRKRNhFps/5hg7EgvDEmaWiA5cu9m7JIeD1JGLt2eY+FrovJl/nzS63BAEplXBYAqwKvtwF1qnoG8HngeyLyhrgnU8/9inTBVHWZqs5R1TlTp04d
qs6jFgvCGwZQXQ1XXdVvbKKmofwMrWLUxeTD6tWl1mAAw25cRGQccCnwA1+mqq+rard7/gTwPPBWYCsQbL87zckAtrvpMH/67OXiaz86sSC8MSYJq2xfscILoKdS3vOwQHpvb9ksPTyAMus1VgrP5XzgGVXtm+4SkakiknDP34QXuH/BTXvtFZGzXJzm48BP3WH3A43ueWNAbsQgmB3WvKaZxtMbixaET89EW/yzxdYexig9UZXtjY2eZ9LczDPvO5uuyQlSwOFyL9yoqSm1BgMoWkBfRFYBc4EpwHZgiareISJ3Ab9T1WC68YeBm4AeIOXGPuD2zcHLPDsSeAi4WlVVRGqBHwJ1QAfwUVXdlU0vC+j3Z4cd6On/w6qqrCpKVlfYtdIZnxjP8kuWW0aZMbxEVbYH6BGXTuyGhSccDyN+U8ywyns/0aBol84toG/tX8YgUdlhCUmQ0lR4q4oCXyud2iNr2fnF4v1hGMYgpkwpXCFlCAeZwB18kk9yB0fwev4n9LPBImpqgKJO142UbDGjhERlgfVqb+w6l7hFl3EzzrpfK94fuWGUgs28lZ28kc1Mj3fA+PGZa2/KLBssG+a5jEHiehNRdS5hU12CoCjJSckBXk/cawHokrH1XTRKTIxpsaFwD5fyLCfTS4IUCSroJUEvJ/EMHx5QNw5MmODV2dTVwVveAuvWRfcZM8/FKBTFWhdl/vR4v4CivI7mNc2DYijqMsE79nSw8KcL+3QNy0QLo/bIPKqlDSNI3IWzCt3Y0d3057GOSeyhAs9IVNDL0exmHmsHH/P66178ZP58rz1/lGGB/mywqBTpbBX8w4wZlzKlEC1ZFv9sMeNuGofcKIy7aRzn330+9d+qZ2nb0ljHR9W5ZJvqOtR7iGseugYY3A6m9shaKtK+dpUVldx80c2xdDKMjMRZOMs3PoWqU0kkvGt9+tOQSFDDLuaxlhQJKnmdFAnmso4aXok+x7Jl2a/jG8OmpvD9UfISYcalTAnzDg70HKB5TbwFgBb/bDFL25bSq94voV7tZc2La2JPUaXXuQS9qArJ/rUJxlAaTm2g/dp2UktS7PziTu6+9O4Bac93fvBOyxQzCkO2hbOCxqdQ9PZ6XsvSpX2exyZmUkkPc1lHJT1sYoZXOv4i0OseF7jjo9ZyCRLscnzOOYO9lETCk5cRFnMpUypurOibagoiCKkl2RcAGnfTuD7DEvuaUoGqDsoWi5NOHIbFUIxhJ9vCWYX0WDKwleOYxB6q2c8+JrL3gkkcd+9fYGJg0H68lryrIk7ik0x6hsVvRhn1Hqy32Ngmbhwl35YsuRoWgMlHTCa1JEX7te0DPIkwLyobucRQihVbMsYgUXEUXz5MVezH8xeq2Q9ANfs57jtphgW811+JcbL0LsdR78Eq9McuucRR8m3JkpDcg3u7XuuvQQ3e8ONOpaWfy4/1LP7Z4shxhWz3bxiha58Ep5RKtTpj1GWHok42A1ommHEZRnKJo4Sti9J4eiPNa5pj/cJvmp17cM/3ilo3tLLwpwv7bvhDwT+uV3tZ2rY00sDkG1syjAE0NHjBcb/5ZPrCWenGJyoOUmiinIpszsbEdHeH7Aa0TLCYyzCSTxxlKC1bzr/7fNa8uKbv9XHVx7F9//bQKbPKisq+wPqUr08ZVNS44BT4ynlQNwk698D1a2DVxkGnyYggg+I5+caWDCNnWlu9AP85HfBdIHifjhsHyZUFeNfKNeYS1dLFfw+dnZ7HEozJFAmLuZQx+cRRcv2F37qhlce6Hhsg+8u+v0TGYiRQmBVmWL77fqg/GirEe/zu+z15LoRNe1m7f2PYaWjw4hityYGGBbyb/930ezKXS1/9Sl4/w1fhGZJ2vO6J7cQzYrsi2iX676GMVp5Mx4zLMJJPHCXOgl7BOEnjTxpzCsIf6j0Uaai+ch5MHD9QNnG8Jx8KQaNo7f6N0hExJzUO785YD3xb4TLPrAgFMDAnAgn3GMc7KrM4Si6YcRlG
8llfPtsv/NYNrTT+pLEvTjKUbLGOPR1U3Dj4K1E3KUqnnC/Rh28U8/lMDCM/Yty40zK6BK/1fgpnaIods+noyNxloIwx4zLMBAsK01N+M5HtF/5VD1w1JIOSTlj8o3NP+NgoeRwqpKIvMQEY0mdiGKHEbf9CC4PnxUJIs0HjVKhQRa6u9eIo9fR7Ot+lOAYmvcvACMCMywgh2y/8/T37i3bt69fA/kMDZfsPefKhkksHZsOITZz2L300AMuAJCDRc17peSX+VNVXGHrtSq4EuwyMEMy4lDHpxYVQml/4qzbClQ9A+25Iqfd45QO5ZYslJIEgofU3lnpsFIxs7V8G0UBflD3KuATvkpWVsG+f5xVVRSwTUawwSbBIMrZ3VjqKZlxEZLmIvCwiGwOyG0Rkq4isd9v8wL4vicgWEXlWRN4bkF/oZFtE5LqA/EQR+b2T/0BE0kLOIxPfoMiNwsfu/VjG4sKg8QljwSnw4jXQ+3+8x1yzu4Ks2ggn3gyJm7xH37BMODiBxf++mAkHJ2Q8PqUpUktSpDQ8vTjuui+GkZF8qtcj+kpqt+e8dFcJvai3wJjq0GtXMrXMz4TvMeXknVEyQ1RMz+Uu4MIQ+TdVdZbbVgOIyAzgMmCmO+Z2EUmISAK4DbgImAEscGMBvubO9Ra8r8Uni/hehoVgtToMjn8Ef+GnV7anU6j04Wy89bm38sadb2T65swLItUc6a3vnWvqsbWGMXIin+r1idWh4lcnQOIGeLVSSfz94f4A/kQYtMDkfuD6DNdIJMKXKM5GsEgyF+8sV0NUQIpmXFT1V0DWNe0dlwDfV9XXVfVFYAtwptu2qOoLqnoI+D5wiXhFGe8GfuyOXwF8sKBvoATE6eHl/8LPNjbX9OEKKqg9shaJuUr4pT++lOtbrueD93kf+4d+8iGub7meS398aej4Vw6+QsWNFew7tI/xiYGKRaUeW2sYI2fyqV4/IjxuWX2U91g3n4EB/Kl4U2k7iF+74rfFr63NnGkmAtXV4V0GcvHOcp4mLByliLl8VkSedNNmk53seOClwJguJ4uS1wK7VfVwmjwUEWkSkTYRaduxY0eh3kfBiTM15P/CzzY2l/Th5KQkd196Nzu/uDN2Vfy6eevYM2kPvRWupX9FL7uP3s3ad4csiIQ3LaYo3a91o6p9hixT6rG1hjFyJlv7l0zsqwkVd7tbRuqrDA7gH4HnrWSrXUkkYNEiuP127/V9Hw3PNPv1Is/DSKXg1VfDiyRz8c5K2ORyuI3LUuDNwCxgG/Bvw3FRVV2mqnNUdc7UqVOH45JDIltVevAXfraxxUgfDrKrdhdr560lkUrweuXrJFIJ1s1dxys1GRZEcvSkeqgeX501MSFO4ahhDGKo1evXA+nOS2Caq2Ja+GFal6G4Mpn0jMXhw/2GBeAdq8Mzzd6xOrueuXhnJWxyOazGRVW3q2qvqqbw7PSZbtdW4ITA0GlOFiXvBo4WkXFp8hFNWC2LP02V/gs/2/LBuaQPp083xe2oPHPTTHoqe1g3bx09lT3M2DQj+0GOXLy0dII1MjZFNhZpZeBP/gJ9B/59V2iLlto7vN2de8MP69ibwbhEeghDzQYgN++shE0uh9W4iMixgZcfAvxMsvuBy0RkgoicCEwH/gA8Dkx3mWHj8YL+96vXbXMt8Pfu+Ebgp8PxHopJWC3Lf176n+gSHfQLP2zseSee12cYck0fvuSkA7zjhMtJqfD853pjBf4fPedRbr36Vh77u8e49epb+e05v439XuP0DosyoFYjM5ZpBZqADrxbeod7XYDvQF1daIuWl9xU8v9eI7x+eNyAQ/wfbJ1R3SqCHkIwa6sr6tYb06OI653lM02YJ0Xriiwiq4C5wBRgO7DEvZ6F961oB65S1W1ufDOwEDgMXKuqDzn5fOBbeP/dy1W1xcnfhBfgrwH+BFyuqum5G4MYKStR5ovcGD/d0c8sCyYA7D+UvZYlIQl6tRdBcm7Nv2jOIm6/+Pas
41o3tNK8ppnOPZ1USEVoF4LkpCTt17bndH1jpFKPZ1DSSeLdUvLAz6wKBMD3V8KV74dVp3mvP3F6JbfOfwPV43ex71ANV6/ey11/7uHWB+EzbQxMh1kA3FoNtfu9eM7Ve+Gunv596V2SqcIr6izPDhW5dkW2lvujlKhW9mG8eI2XqpxO+26vpiWM8YnxHDX+qEEdlOMyFINg7fkNb7Il7HstDC6lj0Mr0Iw3HVUHv5kPl6+Gzk66jq7gi/N6+wwLwIIn4etrE0zbnYK6On7z6flcfsRq1t3QQX0wnhmnxf4C4OsJmJbyrk0L0YYlTc+MY4uDtdwfwwRrQnLxJOJmlvlV9rVH1qKqQzYsMLSgvMVgjMIu6RgyxfaOFdDeAqkUddekBhmW7z4A017p7asZeceXV9A+tYX6vWkzBXFaw6wC6lL0B3ggPJZUxKnAImLGZZSQragyE3Ezy5pmN5FakqJ6fDU9qZ4hauoxlPVaLAZjhDebrHLyXGkG0mvFDjj54O/oV9bAxPSvvV8zUlc3sG4lGXHJ9K99jZ/+nMmAZNazXDHjMkqIU4AZRdzMstWbvTTJfFOBh7peS3oSg/UpG4ukNZskydDjFJkztlrOa+ETp1f2tVBKrie843FnJ6ycP7BuJSrkGfmnk8mA5JFZVkIs5jJKyHUqLJ24yxgLEhlYz0bYMsf5YDEYIz/qyZwc0Mrh1ELGVQR+eYUtTZxMulmtsHMFCDtWxLWDiYolgZfLFPb3VoAkhhzINeYyLvsQYyRQN6murydZED9wns34rNoYr8vxUBciK0ZGV9R7tiWSjXi04E09BT2G4BRb80DDAv1xE99A9NWMfCz8Euq2TrxizPQK/r5U5TqijVPY39tQpwKHD5sWGyXMnz4/o9xvHFkqovTLB1si2ciPbFNsEdNOdfTXjDzcCA3NRHodmX6HjR8fKGaMs3BZIkLP8sQ8l1GCHw+JKx9uiqGHP7Xm18EUcsrNGCs0EH2TjvAm/pIAUvCBfXDWHcChwWPAszf+HbYeLyYD/d7LgJCEr0Nz+DXBu+aQ0q1Lg8VcRgnZ4g/5xmTyxeIgxsjDz+AKTJsF4yYv4hmNUCLiJO14lf8+yaRXYT+AeopWKJoHVucyRsm2TkrcOIQgjK8o/LprFgcxRh5p02ZdiYEB+civdIaCzvRjQnuPFTLdunSYcRklhMUfKisq2XdoX986KpUVlVnPk1qS4vV/eZ3zToxY+CVA3AaXFgcxRi6BZZDrUgMD8pGZwHVEWp70Y0K7Excy3bp0mHEZJaTXgNQeWYuIeOunuHVURDx5HH7x8V+gSxRdoqy8dGVo4HzFh1ZkXFws23othjGiSDcEYS36+zyMEO8jfZXKjN2JA0aNdkaaYQEzLqOKhlMbaL+2va+K/lDvwEDjod5DVI8PX8o123nTOzD7BiNquis5KZl1vRbDGFGkt69fBXy2EvbVMtjDCPE+/rQIfjv83YlLhWWLjVIyLbRVe2RtaF+woFcT7EZcN6kuMpW45bwWmh5oGtAdwKbBjFGJbwiam71YSV0dnN8C1VEGIi0T7R2UMh4/7JhxGaVEFRhWSEWoYamsqOTmi7wWyH6fMt9gdOzpYGnb0r6xfg8vsHRgY4zR0DCqvY1CYqnIo5R0AxGGvw5LclJygEGo/1Z9qGFKx9ZRMYyxQ9mkIovIchF5WUQ2BmT/KiLPiMiTIvITETnayetF5DURWe+2bweOmS0iG0Rki4jcIiLi5DUi8oiIbHaPk4v1XkYicZo8+oYlPS4StzGlrWVvGPlQpOWay4RiBvTvAi5Mkz0CnKKqpwHPAV8K7HteVWe57dMB+VK87PLpbvPPeR2wRlWnA2vcayNAMMCf0vC8+zADEbcmxWpXDGOojMw1WnKhaMZFVX8F7EqT/VxVD7uXvwOmZTqHiBwLvEFVf6fe/N3dwAfd7kuAFe75ioDcCCFbkWWQqHVTgljQ3jDyYWSu0ZILpUxFXgg8FHh9ooj8SUT+
W0TOdbLjga7AmC4nAzhGVbe5538Fjom6kIg0iUibiLTt2LGjQOqPLMIMhiB07OkYtIJjWOrxojmLQlORDcMYCiNzjZZcKEm2mIg0A4fp9wG3AXWq2i0is4H7RGRm3POpqopIZGaCqi7DSzpnzpw5YyuDwRHM6urY09EXzIfw7K+GUxvMeBhG0YhqsT96ppqH3XMRkU8A7wMa3FQXqvq6qna7508AzwNvBbYycOpsmpMBbHfTZv702cvD8gZGMH4MJjkpOaiJpa3gaBjDyejoH5aJYTUuInIh8EXgA6p6ICCfKuKlM4nIm/AC9y+4aa+9InKWyxL7OPBTd9j9QKN73hiQG1nIVGBpGMZwMDr6h2WimKnIq4DHgJNEpEtEPgn8O3AU8EhayvE7gSdFZD3wY+DTquonAywG/gPYgufR+HGarwLvEZHNwPnutRGDXIL7hmEUi5HfPywTVkQ5BgkrsKyqrLIgvWEYkZRNEaVRvmRqRGkYhlEIzHMxDMMwsmKei2EYhlFyzLgYhmEYBSercRGRhIj803AoYxiGYYwOshoXVe0FFgyDLoZhGMYoIW77l0dF5N+BHxBYNVpV/1gUrQzDMIwRTVzjMss93hSQKfDuwqpjGIZhjAZiGRdVnVdsRQzDMIzRQ6xsMRE5RkTuEJGH3OsZrp2LYRiGYQwibiryXcDDwHHu9XPAtcVQyDAMwxj5xDUuU1T1h3gd1nCrSfYWTSvDMPpobW2lvr6eiooK6uvraW0dPUvhGqOXuAH9/SJSixfER0TOAvYUTSvDMADPsDQ1NXHggNdktKOjg6Ymt7Bbg/WCM8qXWL3FRORtwK3AKcBGYCrwEVX9c3HVKzzWW8wYSdTX19PRMXjFwmQySXt7+/ArZIxZcu0tFtdz2QS8CzgJb2WbZ7HWMYZRdDo7IxZ2i5AbRrkQ10A8pqqHVXWTqm5U1R68hcAMwygidXURC7tFyA2jXMhoXETkf4jIbOBIETlDRN7mtrkMXgA67PjlIvKyiGwMyGpE5BER2eweJzu5iMgtIrJFRJ50U3H+MY1u/GYRaQzIZ4vIBnfMLW4pZMMYNbS0tFBVNfBPraqqipaW0bPWujE6yea5vBf4BjAN+LfA9nng+hjnvwu4ME12HUQQ77wAAB0gSURBVLBGVacDa9xrgIuA6W5rApaCZ4yAJcDfAmcCS3yD5MZcGTgu/VqGMaJpaGhg2bJlJJNJRIRkMsmyZcssmG+UPXED+h9W1XuGdAGReuBBVT3FvX4WmKuq20TkWGCdqp4kIt9xz1cFx/mbql7l5N8B1rltraqe7OQLguOisIC+YRhG7hRrsbBpIvIGN3X1HyLyRxG5YIg6HqOq29zzvwLHuOfHAy8FxnU5WSZ5V4h8ECLSJCJtItK2Y8eOIaptGIZhxCWucVmoqnuBC4Ba4GPAV/O9uHpuU9HXWVbVZao6R1XnTJ06tdiXMwzDGPPENS5+oHw+cLeqbgrIcmW7mw7DPb7s5FuBEwLjpjlZJvm0ELlhjBmset8oV+IalydE5Od4xuVhETkK1wpmCNwP+BlfjcBPA/KPu6m3s4A9bvrsYeACEZnsAvkXAA+7fXtF5CyXJfbxwLkMY9TjV+93dHSgqn3V+2ZgjHIgrnH5JF5W19tV9QAwHrgi20EisgqvHuYkEelynZS/CrxHRDYD59M/vbYaeAHYAnwXWAygqruALwOPu+0mJ8ON+Q93zPPAQzHfj2GMeJqbm/vawvgcOHCAxsZG82SMkhM3W+ydYXJV/VXBNSoyli1mjBbilHVVVVVZ6rJREHLNFotrXB4IvDwCr97kCVUdcStRmnExRgvjxo2jtzd7c3LrQ2YUgqKkIqvq+wPbe/AaWL4yVCUNYyxTqCB8HMMCmfuQWUKAUSyG2nyyC/ibQipiGGOBoQThowxAMpmMdc2oPmSWEGAUk7jTYrfSX49SAcwC2lX18iLqVhRsWswoJbm20E9fzwX64ygACxcu5NChQ5HXyxRz
sXb+Ri4Uq0K/DXjCbY8B/2skGhbDKDVhN/N0edBTaWxsDM0Ia25uBiD9x2EikaC2tjZWHzJr528Uk1jruajqimIrYhhjgUQiERorSSQSwGBPJSqu0tHRweWXD/5919vbS3V1NTt37syqS11dXaixs3b+RiHI1nJ/g2t/H7oNl5KGMVqIMha+PKx2JVfieh7Wzt8oJtk8l0vxGku+lCY/Aa/ppGEYMWhtbe2bygrDD84XYkoqrufhT5c1NzfT2dlJXV0dLS0tVhNjFIRsMZdv4rVh6QhuwB63zzCMLASzssIIegv5Tkll8zzSM88A2tvbSaVStLe3m2ExCoeqRm7A4xn2bch0bLlus2fPVsMYTpLJpN/9e9CWSCR00aJFfWNXrlyp48ePjxyfaUsmk7py5cpIPVauXKlVVVUDjqmqqsp4jGH4AG2aw702m+dydIZ9R+ZsyQxjDJJpqqu3t5cVK1YMqC3RGOUBYYR5HrlknhlGIclmXNpE5Mp0oYh8Ci8t2TCMLGSb6gre4Jubm+np6RnSddIr7NOLJKOSCSz12CgGGYsoReQY4CfAIfqNyRy8rsgfUtURF9S3IkpjuAkrhExHREilUlRUVAzZcwGorKzkzjvvpKGhIbJIMh0rmjTiUNAiSlXdrqp/B9wItLvtRlU9eyQaFsMoFpl6dDU0NHD22WdnPN73buIG9CdOnBgq7+np4ZprrgGiCzaDWOqxUTRyCdCMhs0C+kahyRYoX7RoUdZgvB/UX7lyZdaxIpJ1nKpqIpHIOCY9mcAwMkGOAf1YvcVGEzYtZhSabD264rTGr62tZefOnZx//vmsWbMmb51EJNb0mq33YsSlWL3FCoaInCQi6wPbXhG5VkRuEJGtAfn8wDFfEpEtIvKsiLw3IL/QybaIyHXD/V4MA7L36IrTGr+7uxugIIYF4mecZcoWs3b8Rj4Mu3FR1WdVdZaqzgJmAwfwkgYAvunvU9XVACIyA7gMmAlcCNwuIgkRSQC3ARcBM4AFbqxhDCs1NTUZ5RUVw/5nlhO+EQwakylTprBw4UJrx28MmVJ/688Dnlev6j+KS4Dvq+rrqvoisAVvJcwzgS2q+oKqHgK+78YaI5yDew5y28zbOLjnYKlVKQhHHpm9JKy2tnYYNAmnrq5uUNpyd3f3oFb+VhNj5EKpjctlwKrA68+6ppjLRWSykx3PwN5mXU4WJR+EiDSJSJuItO3YsaNw2htFYfPPNrPzqZ1sXr251KrEYteuXRnlcRpRzpo1q6A65UJ3dzeXX355LD2tJsaIS8mMi4iMBz4A/MiJlgJvxluIbBvwb4W6lqouU9U5qjpn6tSphTqtUWDu+cd7+Er1V7iv8T4A7vv4fXyl+ivc84/3lFizzESlD+eSXrx27dqC6pQL+/btiz3W2vEbcSml53IR8EdV3Q59NTW9qpoCvos37QWwFa8Ls880J4uSGyOUeTfNY1LdJCoqva9lRWUFRyePZt6X55VYs8xka10ftj+dVCpVNP0KhdXEGLlQSuOygMCUmIgcG9j3IWCje34/cJmITBCRE4HpwB+Ax4HpInKi84Iuc2ONEUrNW2qYd9M8Uj0pKidWkupJMffGudS8OTxgXi40NDSwbNkykslk3wqQjY2NNDc3U1FRQXNzM42NjVnXvBeRWNebwAQWs5gJTCiE+pFUVFT0JSMkEgnOPvvsvvdk2WNGNkpiXERkIvAe4N6A+Ov+4mTAPOCfAFR1E/BD4Cngv4DPOA/nMPBZ4GHgaeCHbqwxgtn0w01UTqxk7o1zqZxYyaYflcd/aba03IaGhr7W9S0tLaxYsWJAptWKFSuG/qv/VOBaYIn3+NYT3sobeSPTmZ7v24qktraWcePG9XlUvb29rFmzxrLHjNhYEaVRVmx9fCuT6iZRfUw1+7bvY+9LezluznEl1SmsN1im4sNMRZU7d+5k//798S9+KvB+YDxc+uNLOfnZk0n0JkikEvS6f8/wDPcO+J2WP8lk0vqS
GQPItYjSjIthZCFbBX46Uc0nRYSJEyfmFEDnWvoWvqjprmHBqgVM2j2J8YfHc4hD7GY3q1jFK7yS8TRxK/bBe1+dnZ2xxvsNN43RT9lX6BvGSCNbBX46URlVqpqbYQGY1P90V+0u1s5bSyKV4PXK10mQYB3rshoW8BIG4tTSiAgdHR2xCz8te8yIwoyLYWQhW6pxOnGyw2KzZ+DLmZtm0lPZw7qz1tFDDzOI35Tiox/9aKg8mEjgeytxWtZY9piRCTMuhpGFbKnG6aRnj+XFGrzVlByPnvMot151K4/tfoxbuZXf8tvYp1q9enWoPNP0VyKR6MuAW7Ro0YCMOGt4aWTCjIthZCEs1TjbjTWYPZYXG4AHgN2Awl8m/oX9a/fDBtjPfv7CX7Kewp8OG0p1fW9vL6lUivb2ds4555ycjzfGLhbQH1G0As1AJ1AHtAD2y7HcSSQSJQ16+0kEU6ZM6eu+HJeKigp6e3tzzpgzRh+WLZaFkWtcWoEmvCbSPlXAMszAlDfV1dWh6ccTJ07MLS05D0RkyJldqppzxpwx+rBssVFLMwMNC+61daktd6IaQh44cIBEIjHk865cuZLx48fHGquqeXlPuWbMGYYZlxFD1B+x/XGXO5myzU466aQhn7ehoYHly5cXJnEgAj9ek2vGnGGYcRkxRP0R5/LH3QrU4/2317vXRrHJlG32zDPP5HXugiUOhFBZWcnNN98M5J4xZxio6pjaZs+ercVlkaom3OUS7vVQWamqSVUVVa1V1fE68O1UuTFxz1WVx/FGPixatEgTiYQCKiJaXV2tIqLAkLd08jmXiGhtba3W1taqiGgymdSVKwd+N1auXKnJZDJyvzG6Ado0h3uteS4FZTHesjR+AVqve704MCau9+AH8Dvw/v673fn8/7IE0MjAYH6mc1vMplS0trayYsWKvsJEdZX6mkcyjd+xuFDdiVOpFDt37mTnzp19qcfAgGadQJ+X1N7eblliRmZysUSjYSuu5+J7LOlbwu3PxXtIRpwr6tiVOtizGaeexyMZziH5v20jI8lkMi+vIttWVVWlK1eu1Nra2iEdX1tbO0jnlStXalVVVeh1jLEJOXouJb/ZD/dWGOMSnK5Kav8NPtOlVaMNRjLkGpkMQtixtTHHx7m2UUjynf6Ks/nTVJWVlTkdV1lZGWowogxiMpkc/g/QKAtyNS42LZYz6dNVHe51K95UVRi+PJeMr7iBev/Y3IrjPKrwCjGNfMm03kucjCq/zUrchpHpdHR00NzczKc+9alY2WN+p4E777wzdHrLUo+NvMnFEhVyA9rxmlusx1lEoAZ4BNjsHic7uQC3AFuAJ4G3Bc7T6MZvBhqzXTd/zyUZceqkesH7sH1+UD/Kuxg8LRE+hRa2JTS+l+Nv6R6XkQ/ZppDC9keNjRoTdwt6IlHTZGHTYL6efsDeTz5I38xzGbswUqbFnHGZkib7OnCde34d8DX3fD7wkDMyZwG/135j9IJ7nOyeT8503fyNS9SN3I9dZMoWy8W4pJ+rwm35fvTJIb9zI5w4U0jBG3emrKx8jUvQeORiXLIZQH9btCif7EdjJJOrcSlZ+xcRaQfmqOrOgOxZYK6qbhORY4F1qnqSiHzHPV8VHOdvqnqVkw8YF0b+7V/q8abC0kni2ctMVOD9jaYjgF+n4PcP63Dy4PjxwFHALneu7G3RB2LtYopBpsXBcq0/GUr/rzBUNSe9otq7pGPtXsYuI6n9iwI/F5EnRKTJyY5R1W3u+V+BY9zz44GXAsd2OVmUfAAi0iQibSLStmPHjjzVbsG7SQeJG7uImnuvcNsUYCH9xiv9xnAIqMYzRNluWgngPDyjJ+7RDEsxKGT1+s033xy7pUs2ctErbizFYi5GXEppXN6hqm8DLgI+IyLvDO50blhB3CpVXaaqc1R1ztSpU/M8WwPeTXooN+0wwwSeB6J4QflDIfuD+IYn242rF3jMXTOF51WZYSkGhaxeT2/pMpTeY34N
zL59+6isrIylV1xDaO1ejLiUzLio6lb3+DLwE+BMYLubDsM9vuyGbwVOCBw+zcmi5EWmAe9mHeemHSxsbMbLP/AN01CaFvr/ZVGGKogVSQ4HQ1nvJdv5/GLFFStW5LyqZSqVQlXp7u5GRKitrc2qV5zVM63di5ETuQRoCrUBE4GjAs9/C1wI/CsDA/pfd88vZmBA/w/aH9B/ES+YP9k9r8l07eLWuaTvy9ayJdcsL38Lu1bUWCuSHOnEyeIC8s7wSm/vsmjRImv3YvTBSMgWA94E/Nltm4BmJ6/FW9h1M/AL31A4o3Ib8Dxe+vKcwLkW4qUobwGuyHbt/I1Lpir7uOnDSXeuZB4fY3omWtS5kmqMHjKlPUcVa4rYDwwjf3I1LrZYWM7UE50tRsS+dPzssLAFwHJloju+BtgL9AT2WXbYaKS1tZXm5mY6Ozupq6ujpaWFhoYGW9DLKCojKVtsBLEYGIdnFKKMRyfx11bxg6INeDGYoS8YBfvpTwYQPOfPssNGM8GYTLCBpLXFN8oJMy5ZSe90HEUd8Vq2BNOWW4EVMc4dl2CqcjtmWEY/wbYzzc3NNDY2FiyxwDDywabFsjKO7Dd/f/oJBk9zVQJvwCt8rMMzLP4fez3xptFyIViQaYxmWltbaWpqGrCMclVVlRkUoyjYtFjByWRY0qefwmpg7gR2Eu5NZJtGG8rStVaHMFZobm4eYFgADhw4QHOzpZ8bpWdcqRUof9JbsATlYR6Cb2TiUEe455Kgv7AyigqnQ9D4jQf2uX3pXpIx2rDOxUY5Y55LVibmKM+FTBX7UfjGrhcvXuN7SbX0B/aVgUsBGKORQradMYxCY8YlK/tzlOdC+jRanKyx4I0j2CmgmoFpyODFfhrJvqSyMRKx7DCjnDHjkpWoX4GF+nUYNBDZAvGZGmRGTYX402vmyYw2Ct12xjAKiWWLZSWs0LGQxYl+i/1OMrfRT5I5hlJPvMyzOEsDGIZhDMSyxQpOPl2Qs9FKf4t9P44SxiKy163EaWQJ8Qs9DcMwho5li8UilwywXLiG7C32AVbHGOPrl80LsmCvYRjFxzyXkhJ3xcEOcg/IH41XwBkkvTtAPRbsNwyjGJhxGTF0AFfgrVbpG4TF9BuI4CqW2XqN+XEkf6wF+w3DKCxmXIpOuocQNAi5fvw9DKxjWcpAY5I+xRbVa6yZwZ2YbWExwzAKh8Vcikp6pplvEHyGI1MvLIAfFdS3YL9hGIXBPJeiEuYhhJFgaH3E4hAWwC927Y5hGGOdYTcuInKCiKwVkadEZJOIXOPkN4jIVhFZ77b5gWO+JCJbRORZEXlvQH6hk20RkeuG+71kJ64n4BdQJrMNzJFgr7F6+mMqYWnLmQo0DcMwcqMU02KHgX9W1T+KyFHAEyLyiNv3TVX9RnCwiMwALgNmAscBvxCRt7rdtwHvAbqAx0XkflV9aljeRSyiGlOGjQPv5p5esOn3EqsFXiVz6nKwvb+/MqWfkeYH7WFw2rI1uTQMo7AMu+eiqttU9Y/u+avA08DxGQ65BPi+qr6uqi8CW4Az3bZFVV9Q1UPA993YMiJOYWPQYwgr2PxPPOOyE1ietm8R0e39o3qN+UH7YNuZdsywGIZRSEoa0BeReuAM4PfAOcBnReTjQBued/MKnuH5XeCwLvqN0Utp8r+NuE4T7mf78HaMDfMQ5uMVRUZ5DJkKNnMp5rSgvWEYpaNkAX0RqQbuAa5V1b14aVRvBmYB24B/K9S1VHWZqs5R1TlTp04t1Gljku4h3M7weAwWtDcMo3SUxLiISCWeYWlV1XsBVHW7qvaqagr4Lt60F8BW4ITA4dOcLEpuABa0NwyjlJQiW0yAO4CnVfX/BeTHBoZ9CNjont8PXCYiE0TkRGA68AfgcWC6iJwoIuPxgv73D8d7GBkUs+GmYRhGZkoRczkH+BiwQUTWO9n1wAIRmYUXvW4HrgJQ1U0i8kPgKbxMs8+oai+AiHwWeBivUGS5qm4azjdS
/hSr4aZhGEZmbD0XwzAMIyu2nsuIx7oVG4Yx8rHeYmVFWC+y9MJHwzCM8sc8l7LCuhUbhjE6MONSVljho2EYowMzLmWFFT4ahjE6MONSVljho2EYowMzLmWFFT4ahjE6sGyxssMKHw3DGPmY52IYhmEUHDMuhmEYRsEx4zKqsOp+wzDKA4u5jBqsut8wjPLBPJdRg1X3G4ZRPphxGTVYdb9hGOWDGZdRg1X3G4ZRPphxGTVYdb9hGOXDiDcuInKhiDwrIltE5LriXGUkZGFZdb9hGOXDiM4WE5EEcBvwHqALeFxE7lfVpwp3lZGUhWXV/YZhlAcj3XM5E9iiqi+o6iHg+8Alhb2EZWEZhmHkykg3LscDLwVedznZAESkSUTaRKRtx44dOV7CsrAMwzByZaQbl1io6jJVnaOqc6ZOnZrj0ZaFZRiGkSsj3bhsBU4IvJ7mZAXEsrAMwzByZaQbl8eB6SJyooiMBy4D7i/sJSwLyzAMI1dGdLaYqh4Wkc8CDwMJYLmqbir8lSwLyzAMIxdGtHEBUNXVwOpS62EYhmH0M9KnxQzDMIwyxIyLYRiGUXDMuBiGYRgFx4yLYRiGUXBEVUutw7AiIjvwGoQVmynAzmG4zlAoV91Mr9wwvXLD9MqNdL2Sqhq7Cn3MGZfhQkTaVHVOqfUIo1x1M71yw/TKDdMrN/LVy6bFDMMwjIJjxsUwDMMoOGZciseyUiuQgXLVzfTKDdMrN0yv3MhLL4u5GIZhGAXHPBfDMAyj4JhxMQzDMAqOGZcCICIniMhaEXlKRDaJyDVOXiMij4jIZvc4eZj1OkJE/iAif3Z63ejkJ4rI70Vki4j8wC1XMOyISEJE/iQiD5aLXiLSLiIbRGS9iLQ5WUn/H50OR4vIj0XkGRF5WkTOLrVeInKS+5z8ba+IXFtqvZxu/+S+8xtFZJX7WyiH79c1TqdNInKtk5Xk8xKR5SLysohsDMhCdRGPW9xn96SIvC3b+c24FIbDwD+r6gzgLOAzIjIDuA5Yo6rTgTXu9XDyOvBuVT0dmAVcKCJnAV8DvqmqbwFeAT45zHr5XAM8HXhdLnrNU9VZgRz/Uv8/AtwM/Jeqngycjve5lVQvVX3WfU6zgNnAAeAnpdZLRI4HPgfMUdVT8JbjuIwSf79E5BTgSuBMvP/D94nIWyjd53UXcGGaLEqXi4DpbmsClmY9u6raVuAN+CnwHuBZ4FgnOxZ4toQ6VQF/BP4Wr+p2nJOfDTxcAn2muS/vu4EH8VZiKwe92oEpabKS/j8Ck4AXcQk45aJXmi4XAI+Wg17A8cBLQA3esiIPAu8t9fcL+AhwR+D1vwBfLOXnBdQDG7N9p4DvAAvCxkVt5rkUGBGpB84Afg8co6rb3K6/AseUQJ+EiKwHXgYeAZ4HdqvqYTekC++Pcbj5Ft4fVsq9ri0TvRT4uYg8ISJNTlbq/8cTgR3AnW4a8T9EZGIZ6BXkMmCVe15SvVR1K/ANoBPYBuwBnqD036+NwLkiUisiVcB8vGXay+n/MUoX32D7ZP38zLgUEBGpBu4BrlXVvcF96pn7Yc/7VtVe9aYtpuG54ycPtw7piMj7gJdV9YlS6xLCO1T1bXjTAJ8RkXcGd5bo/3Ec8DZgqaqeAewnbeqkVN8vABe7+ADwo/R9pdDLxQkuwTPKxwETGTz9M+yo6tN4U3M/B/4LWA/0po0p2f9jOvnqYsalQIhIJZ5haVXVe514u4gc6/Yfi+c9lARV3Q2sxZsOOFpE/FVIpwFbh1mdc4APiEg78H28qbGby0Av/1cvqvoyXvzgTEr//9gFdKnq793rH+MZm1Lr5XMR8EdV3e5el1qv84EXVXWHqvYA9+J958rh+3WHqs5W1XfixX2eo/SfV5AoXbbieVk+WT8/My4FQEQEuAN4WlX/X2DX/UCje96IF4sZTr2misjR7vmReHGgp/GMzN+XSi9V/ZKqTlPVerzplF+qakOp9RKRiSJylP8cL46wkRL/P6rq
X4GXROQkJzoPeKrUegVYQP+UGJRer07gLBGpcn+b/udV0u8XgIi80T3WAZcC36P0n1eQKF3uBz7ussbOAvYEps/CGc6A1mjdgHfguY9P4rm66/HmU2vxgtabgV8ANcOs12nAn5xeG4H/4+RvAv4AbMGbyphQws9uLvBgOejlrv9nt20Cmp28pP+PTodZQJv7v7wPmFwmek0EuoFJAVk56HUj8Iz73v8nMKHU3y+n16/xDN2fgfNK+Xnh/SDYBvTgecefjNIFL+HmNryY7Qa8TLyM57f2L4ZhGEbBsWkxwzAMo+CYcTEMwzAKjhkXwzAMo+CYcTEMwzAKjhkXwzAMo+CYcTGMISIivWldgeuHcI4PuianhjGqGJd9iGEYEbymXmudfPggXmPFp+IeICLjtL9HlmGUJea5GEYBEZHZIvLfrvHlw4FWGleKyOPira1zj6se/zu8nlz/6jyfN4vIOhGZ446Z4lrkICKfEJH7ReSXwBrXTWC5eOv1/ElELnHjZjrZerfuxvTSfBLGWMeMi2EMnSMDU2I/cf3lbgX+XlVnA8uBFjf2XlV9u3pr6zwNfFJVf4vXVuML6q2L8nyW673NnftdQDNe25wzgXl4Bmoi8GngZudRzcGrvDaMYcemxQxj6AyYFnOLQZ0CPOK1tCKB114D4BQR+b/A0UA18PAQrveIqu5yzy/Aa/75P93rI4A64DGgWUSm4Rm0zUO4jmHkjRkXwygcAmxS1bND9t0FfFBV/ywin8DrqRbGYfpnFI5I27c/7VofVtVn08Y8LSK/By4GVovIVar6y/hvwTAKg02LGUbheBaYKiJng7cMg4jMdPuOAra5qbOGwDGvun0+7XhLBkN/B98wHgaudl1/EZEz3OObgBdU9Ra8jran5fWODGOImHExjAKhqofwDMLXROTPeN2x/87t/he81UkfxevW6/N94AsuKP9mvBUUF4nIn4ApGS73ZaASeFJENrnXAB8FNrrVR08B7i7ImzOMHLGuyIZhGEbBMc/FMAzDKDhmXAzDMIyCY8bFMAzDKDhmXAzDMIyCY8bFMAzDKDhmXAzDMIyCY8bFMAzDKDj/H9WuQrY8fkOCAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.scatter(df0['Age'],df0['Savings'],color='black')\n",
    "plt.scatter(df1['Age'],df1['Savings'],color='green')\n",
    "plt.scatter(df2['Age'],df2['Savings'],color='red')\n",
    "plt.scatter(df3['Age'],df3['Savings'],color='yellow')\n",
    "plt.scatter(km_model.cluster_centers_\n",
    "           [:,0],km_model.\n",
    "            cluster_centers_[:,3],\n",
    "           color='purple',marker='*',\n",
    "           label='centroid')\n",
    "\n",
    "plt.xlabel('Features')\n",
    "plt.ylabel('Cluster')\n",
    "plt.legend()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Using Silhouette Plot\n",
    "+ The silhouette score is a measure of how similar an object is to its own cluster (cohesion) compared to other clusters (separation)."
   ]
  },
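  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For a single sample the silhouette coefficient is s = (b - a) / max(a, b), where a is the mean distance to the other members of its own cluster (cohesion) and b is the mean distance to the members of the nearest other cluster (separation). The sketch below checks that formula by hand against `silhouette_samples`; the four 1-D points and their labels are made up for illustration.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from sklearn.metrics import silhouette_samples\n",
    "\n",
    "# Toy data (made up): two tight clusters on a line\n",
    "X = np.array([[0.0], [1.0], [10.0], [11.0]])\n",
    "labels = np.array([0, 0, 1, 1])\n",
    "\n",
    "# Silhouette of the first point, computed by hand\n",
    "a = np.abs(X[0] - X[1]).item()                      # cohesion: distance to its only cluster-mate\n",
    "b = np.mean([np.abs(X[0] - X[2]).item(),\n",
    "             np.abs(X[0] - X[3]).item()])           # separation: mean distance to the other cluster\n",
    "s_manual = (b - a) / max(a, b)\n",
    "\n",
    "# Same quantity from scikit-learn\n",
    "s_sklearn = silhouette_samples(X, labels)[0]\n",
    "\n",
    "print(s_manual, s_sklearn)\n",
    "```\n",
    "\n",
    "`silhouette_score` is the mean of these per-sample values, which is the average line drawn on the silhouette plot."
   ]
  },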
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {
    "jupyter": {
     "source_hidden": true
    }
   },
   "outputs": [],
   "source": [
    "from sklearn.metrics import silhouette_samples, silhouette_score\n",
    "\n",
    "def silhoutte_kmeans_plot(X,range_n_clusters):\n",
    "#     range_n_clusters = [2, 3, 4, 5, 6]\n",
    "    for n_clusters in range_n_clusters:\n",
    "        # Create a subplot with 1 row and 2 columns\n",
    "        fig, (ax1, ax2) = plt.subplots(1, 2)\n",
    "        fig.set_size_inches(18, 7)\n",
    "\n",
    "        # The 1st subplot is the silhouette plot\n",
    "        # The silhouette coefficient can range from -1, 1 but in this example all\n",
    "        # lie within [-0.1, 1]\n",
    "        ax1.set_xlim([-0.1, 1])\n",
    "        # The (n_clusters+1)*10 is for inserting blank space between silhouette\n",
    "        # plots of individual clusters, to demarcate them clearly.\n",
    "        ax1.set_ylim([0, len(X) + (n_clusters + 1) * 10])\n",
    "\n",
    "        # Initialize the clusterer with n_clusters value and a random generator\n",
    "        # seed of 10 for reproducibility.\n",
    "        clusterer = KMeans(n_clusters=n_clusters, random_state=10)\n",
    "        cluster_labels = clusterer.fit_predict(X)\n",
    "\n",
    "        # The silhouette_score gives the average value for all the samples.\n",
    "        # This gives a perspective into the density and separation of the formed\n",
    "        # clusters\n",
    "        silhouette_avg = silhouette_score(X, cluster_labels)\n",
    "        print(\"For n_clusters =\", n_clusters,\n",
    "              \"The average silhouette_score is :\", silhouette_avg)\n",
    "\n",
    "        # Compute the silhouette scores for each sample\n",
    "        sample_silhouette_values = silhouette_samples(X, cluster_labels)\n",
    "\n",
    "        y_lower = 10\n",
    "        for i in range(n_clusters):\n",
    "            # Aggregate the silhouette scores for samples belonging to\n",
    "            # cluster i, and sort them\n",
    "            ith_cluster_silhouette_values = \\\n",
    "                sample_silhouette_values[cluster_labels == i]\n",
    "\n",
    "            ith_cluster_silhouette_values.sort()\n",
    "\n",
    "            size_cluster_i = ith_cluster_silhouette_values.shape[0]\n",
    "            y_upper = y_lower + size_cluster_i\n",
    "\n",
    "            color = cm.nipy_spectral(float(i) / n_clusters)\n",
    "            ax1.fill_betweenx(np.arange(y_lower, y_upper),\n",
    "                              0, ith_cluster_silhouette_values,\n",
    "                              facecolor=color, edgecolor=color, alpha=0.7)\n",
    "\n",
    "            # Label the silhouette plots with their cluster numbers at the middle\n",
    "            ax1.text(-0.05, y_lower + 0.5 * size_cluster_i, str(i))\n",
    "\n",
    "            # Compute the new y_lower for next plot\n",
    "            y_lower = y_upper + 10  # 10 for the 0 samples\n",
    "\n",
    "        ax1.set_title(\"The silhouette plot for the various clusters.\")\n",
    "        ax1.set_xlabel(\"The silhouette coefficient values\")\n",
    "        ax1.set_ylabel(\"Cluster label\")\n",
    "\n",
    "        # The vertical line for average silhouette score of all the values\n",
    "        ax1.axvline(x=silhouette_avg, color=\"red\", linestyle=\"--\")\n",
    "\n",
    "        ax1.set_yticks([])  # Clear the yaxis labels / ticks\n",
    "        ax1.set_xticks([-0.1, 0, 0.2, 0.4, 0.6, 0.8, 1])\n",
    "\n",
    "        # 2nd Plot showing the actual clusters formed\n",
    "        colors = cm.nipy_spectral(cluster_labels.astype(float) / n_clusters)\n",
    "        ax2.scatter(X[:, 0], X[:, 1], marker='.', s=30, lw=0, alpha=0.7,\n",
    "                    c=colors, edgecolor='k')\n",
    "\n",
    "        # Labeling the clusters\n",
    "        centers = clusterer.cluster_centers_\n",
    "        # Draw white circles at cluster centers\n",
    "        ax2.scatter(centers[:, 0], centers[:, 1], marker='o',\n",
    "                    c=\"white\", alpha=1, s=200, edgecolor='k')\n",
    "\n",
    "        for i, c in enumerate(centers):\n",
    "            ax2.scatter(c[0], c[1], marker='$%d$' % i, alpha=1,\n",
    "                        s=50, edgecolor='k')\n",
    "\n",
    "        ax2.set_title(\"The visualization of the clustered data.\")\n",
    "        ax2.set_xlabel(\"Feature space for the 1st feature\")\n",
    "        ax2.set_ylabel(\"Feature space for the 2nd feature\")\n",
    "\n",
    "        plt.suptitle((\"Silhouette analysis for KMeans clustering on sample data \"\n",
    "                      \"with n_clusters = %d\" % n_clusters),\n",
    "                     fontsize=14, fontweight='bold')\n",
    "\n",
    "    plt.show()"
   ]
  },
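  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of how the average silhouette score can help compare candidate numbers of clusters. The synthetic `make_blobs` data, the names `X_demo`, `scores` and `best_k`, and the candidate values of k are assumptions for illustration, not the dataset used in this notebook. The same array could also be passed to `silhoutte_kmeans_plot` to draw the per-cluster plots."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.cluster import KMeans\n",
    "from sklearn.datasets import make_blobs\n",
    "from sklearn.metrics import silhouette_score\n",
    "\n",
    "# Synthetic stand-in data (an assumption): four blobs around fixed centers\n",
    "X_demo, _ = make_blobs(n_samples=300,\n",
    "                       centers=[[0, 0], [5, 5], [0, 5], [5, 0]],\n",
    "                       cluster_std=0.5, random_state=10)\n",
    "\n",
    "# Average silhouette score for each candidate number of clusters\n",
    "scores = {}\n",
    "for k in [2, 3, 4, 5, 6]:\n",
    "    labels_k = KMeans(n_clusters=k, n_init=10, random_state=10).fit_predict(X_demo)\n",
    "    scores[k] = silhouette_score(X_demo, labels_k)\n",
    "\n",
    "# A higher average score suggests tighter, better separated clusters\n",
    "best_k = max(scores, key=scores.get)\n",
    "print(scores)\n",
    "print('k with the highest average silhouette score:', best_k)"
   ]
  },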
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Clustering using DBSCAN\n",
    "+ DBSCAN stands for Density-Based Spatial Clustering of Applications with Noise\n",
    "+ It is an unsupervised learning method that identifies distinct groups/clusters in the data, based on the idea that a cluster in data space is a contiguous region of high point density, separated from other such clusters by contiguous regions of low point density. (kdnuggets)\n",
    "+ DBSCAN is a base algorithm for density-based clustering.\n",
    "+ It can discover clusters of different shapes and sizes in a large amount of data\n",
    "\n",
    "#### Usefulness\n",
    "+ Unsupervised ML\n",
    "+ Outlier Detection\n",
    "+ Noise detection\n",
    "\n",
    "\n",
    "#### Terms\n",
    "The DBSCAN algorithm uses two parameters:\n",
    "+ eps (ε): The radius of the neighborhood around a point; two points are neighbors if the distance between them is at most eps.\n",
    "\n",
    "+ minPts: The minimum number of points (a threshold) within eps of a point for that region to be considered dense.\n",
    "    - Rule of thumb: minPts >= number of dimensions in the dataset + 1\n",
    "\n",
    "Using these parameters, every point falls into one of three types:\n",
    "+ Core point: a point that has at least minPts points within distance eps of itself.\n",
    "+ Border point: a point that has fewer than minPts points within distance eps, but lies within distance eps of at least one core point.\n",
    "+ Noise point: a point that is neither a core point nor a border point.\n",
    "\n",
    "These parameters can be understood if we explore two concepts called Density Reachability and Density Connectivity.\n",
    "\n",
    "Reachability, in terms of density, establishes a point to be directly reachable from a core point if it lies within a particular distance (eps) from it.\n",
    "\n",
    "Connectivity, on the other hand, involves a transitivity-based chaining approach to determine whether points are located in the same cluster. For example, points p and q could be connected if p->r->s->t->q, where a->b means b is in the neighborhood of a."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](dbscan_graph.png)\n",
    "\n",
    "![](dbscan_terms.png)"
   ]
  },
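  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Scaling Dataset\n",
    "+ DBSCAN and other distance-based methods are sensitive to feature scale, so scale the features before clustering\n",
    "+ StandardScaler: rescales each feature to zero mean and unit variance\n",
    "+ MinMaxScaler: rescales each feature to a fixed range, [0, 1] by default"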
   ]
  },
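  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A small sketch of the two scalers, assuming a toy array `X_raw` with an age column and a savings column (made-up values, not the notebook's dataset). It shows why scaling matters: without it, the savings column would dominate any distance computation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.preprocessing import StandardScaler, MinMaxScaler\n",
    "\n",
    "# Toy data (an assumption): age in years, savings in currency units\n",
    "X_raw = np.array([[25, 1000.0],\n",
    "                  [35, 5000.0],\n",
    "                  [45, 20000.0],\n",
    "                  [55, 50000.0]])\n",
    "\n",
    "# StandardScaler: each column ends up with mean 0 and standard deviation 1\n",
    "X_std = StandardScaler().fit_transform(X_raw)\n",
    "\n",
    "# MinMaxScaler: each column is mapped onto the range [0, 1]\n",
    "X_minmax = MinMaxScaler().fit_transform(X_raw)\n",
    "\n",
    "print('StandardScaler mean/std:', X_std.mean(axis=0), X_std.std(axis=0))\n",
    "print('MinMaxScaler min/max:', X_minmax.min(axis=0), X_minmax.max(axis=0))"
   ]
  },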
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
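  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A hedged sketch of DBSCAN in scikit-learn, where `eps` and `min_samples` play the roles of eps and minPts (scikit-learn counts the point itself towards `min_samples`). The two-moons data and the values `eps=0.3` and `min_samples=5` are assumptions chosen for illustration; real data usually needs its own tuning. Points labelled `-1` are the noise points."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.cluster import DBSCAN\n",
    "from sklearn.datasets import make_moons\n",
    "from sklearn.preprocessing import StandardScaler\n",
    "\n",
    "# Synthetic stand-in data (an assumption): two interleaving half circles\n",
    "X_moons, _ = make_moons(n_samples=300, noise=0.05, random_state=42)\n",
    "X_scaled = StandardScaler().fit_transform(X_moons)\n",
    "\n",
    "# Fit DBSCAN; labels_ holds the cluster index of each point, -1 means noise\n",
    "db = DBSCAN(eps=0.3, min_samples=5).fit(X_scaled)\n",
    "db_labels = db.labels_\n",
    "\n",
    "# Separate core, border and noise points\n",
    "core_mask = np.zeros(len(db_labels), dtype=bool)\n",
    "core_mask[db.core_sample_indices_] = True\n",
    "n_core = int(core_mask.sum())\n",
    "n_noise = int(np.sum(db_labels == -1))\n",
    "n_border = int(np.sum((db_labels != -1) & ~core_mask))\n",
    "n_found = len(set(db_labels)) - (1 if -1 in db_labels else 0)\n",
    "\n",
    "print('clusters found:', n_found)\n",
    "print('core:', n_core, 'border:', n_border, 'noise:', n_noise)"
   ]
  },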
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Clustering using BIRCH"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "+ BIRCH stands for Balanced Iterative Reducing and Clustering using Hierarchies\n",
    "+ It is a hierarchical clustering algorithm.\n",
    "+ It provides a memory-efficient clustering method for large datasets.\n",
    "+ Clustering is performed on compact summaries of the data rather than on every individual point.\n",
    "+ The BIRCH algorithm builds a Clustering Feature (CF) Tree for a given dataset. Each node of the tree holds a number of sub-clusters, and each sub-cluster keeps only a summary of the points it absorbed (the necessary part of the data). Thus the method does not need to hold the entire dataset in memory.\n",
    "\n",
    "\n",
    "#### Terms\n",
    "+ branching_factor: the maximum number of sub-clusters allowed in each node of the CF tree\n",
    "+ threshold: the limit on the radius of a sub-cluster after a new sample is merged into it; if the limit would be exceeded, the sample starts a new sub-cluster\n",
    "\n",
    "\n",
    "#### Benefit\n",
    "+ Useful for large datasets\n",
    "+ BIRCH can work with any given amount of memory, and the I/O complexity is a little more than one scan of the data."
   ]
  },
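  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of BIRCH with scikit-learn's `Birch` estimator, showing where `threshold` and `branching_factor` are set. The synthetic blobs, the parameter values and `n_clusters=3` are assumptions for illustration only. The CF sub-clusters are built first and then grouped into the final `n_clusters` clusters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.cluster import Birch\n",
    "from sklearn.datasets import make_blobs\n",
    "\n",
    "# Synthetic stand-in data (an assumption): three blobs around fixed centers\n",
    "X_blobs, _ = make_blobs(n_samples=500,\n",
    "                        centers=[[0, 0], [6, 6], [0, 6]],\n",
    "                        cluster_std=0.7, random_state=42)\n",
    "\n",
    "# threshold limits the radius of each sub-cluster; branching_factor limits\n",
    "# the number of sub-clusters per node of the CF tree\n",
    "birch = Birch(threshold=0.5, branching_factor=50, n_clusters=3)\n",
    "birch_labels = birch.fit_predict(X_blobs)\n",
    "\n",
    "print('CF sub-clusters:', len(birch.subcluster_centers_))\n",
    "print('final clusters:', len(set(birch_labels)))"
   ]
  },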
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Clustering using Hierarchical Clustering\n",
    "+ Hierarchical Clustering: the process of building a hierarchy (a nested ordering) of clusters, either by merging clusters until only one is left or by splitting one cluster until every point stands alone\n",
    "    \n",
    "#### Types of Hierarchical Clustering\n",
    "+ Additive(Agglomerative) hierarchical clustering\n",
    "    - Agglomerate (merge or join)\n",
    "    - Assign each point to its own cluster\n",
    "    - Merge/Join the closest pair of clusters into one\n",
    "    - Repeat until you have a single cluster\n",
    "\n",
    "+ Divisive hierarchical clustering\n",
    "    - Opposite of Additive\n",
    "    - Start with a single large cluster\n",
    "    - Divide/Split off the points that are farthest from the rest of the cluster\n",
    "    - Repeat until each cluster only contains a single data point\n",
    "    \n",
    "    \n",
    "#### Terms\n",
    "+ Similarity Distance: the measure (for example Euclidean distance) used to decide how close two points or clusters are\n",
    "+ Proximity Matrix: it stores the distance between each pair of points\n",
    "+ Dendrogram: used to find the number of clusters\n",
    "    - A dendrogram is a tree-like diagram that records the sequence of merges or splits.\n",
    "    - The number of clusters is the number of vertical lines intersected by a horizontal line drawn at the chosen distance threshold."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Agglomerative Hierarchical Clustering\n",
    "+ Merging clusters\n",
    "+ Dendrogram to decide the n_clusters to use, i.e. where to stop merging"
   ]
  },
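  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A small sketch of agglomerative clustering, assuming SciPy is available alongside scikit-learn. The six toy points, the Ward linkage and the cut at distance `3.0` are assumptions for illustration. `linkage` records the merges, `fcluster` cuts the tree at a distance threshold (the same idea as drawing a horizontal line across the dendrogram), and `AgglomerativeClustering` fits the same kind of model once `n_clusters` has been chosen."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from scipy.cluster.hierarchy import linkage, fcluster, dendrogram\n",
    "from sklearn.cluster import AgglomerativeClustering\n",
    "\n",
    "# Toy 2-D points (an assumption): two tight groups far apart\n",
    "X_toy = np.array([[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],\n",
    "                  [8.0, 8.0], [8.1, 7.9], [7.9, 8.2]])\n",
    "\n",
    "# Each row of Z records one merge, so there are n_samples - 1 rows\n",
    "Z = linkage(X_toy, method='ward')\n",
    "\n",
    "# Cut the tree at a distance threshold to get flat cluster labels\n",
    "labels_cut = fcluster(Z, t=3.0, criterion='distance')\n",
    "\n",
    "# dendrogram(Z) would draw the tree; no_plot=True only computes its layout\n",
    "tree = dendrogram(Z, no_plot=True)\n",
    "\n",
    "# Fit scikit-learn's agglomerative model with the chosen number of clusters\n",
    "agg = AgglomerativeClustering(n_clusters=2, linkage='ward')\n",
    "labels_agg = agg.fit_predict(X_toy)\n",
    "\n",
    "print('clusters from the cut:', len(set(labels_cut)))\n",
    "print('AgglomerativeClustering labels:', labels_agg)"
   ]
  },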
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [],
   "source": [
    "#### Thanks For Watching\n",
    "### Jesus Saves @JCharisTech\n",
    "### By Jesse E.Agbe(JCharis)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
